8.2-4
Let $A[1...n]$ be input integers in the range $0$ to $k$.
Design:
Preprocess($A$):
Create array $R[-1...k]$ which all values are $0$
for $i$ from $1$ to $n$:
$R[ A[i] ] := R[ A[i] ] + 1$
for $j$ from $0$ to $k$:
$R[j] := R[j-1] + R[j]$
Return $R$
$c$ $:=$ Preprocess($A$)
Query($a$, $b$):
Return $c[b] - c[a-1]$
Correctness:
Firstly, we are gonna consider Preprocess. Observe that the first for-loop will never access $R$ out of range, since $0 \leq A[i] \leq k$ for all $1 \leq i \leq n$.
When the first for-loop terminates, it's easy to see that $R[j]$ records how many elements in $A$ equals to $j$.
To complete the proof, we have to prove $R[j] = \sharp \{ A[i] : A[i] \leq j \}$. $(*)$
Because $(*)$ is not so obvious, we are gonna use loop invariant to prove it.
Loop Invariant: At the start of the $l$-th iteration, $(*)$ is true for all $-1 \leq j < l$.
Initialization: When the first iteration starts, $R[-1]$ is $0$, so loop invariant holds.
Maintenance:
Suppose loop invariant holds when the $l$-th iteration starts. Observe that $R[j]$ is unmodified for all $l \leq j \leq k$ until the start of current iteration.
It implies that $R[l]$ stores how many $l$'s in $A[1...n]$. Therefore, when the current iteration ends, $R[l]$ equals to $\sharp \{ A[i] : A[i] \leq l \}$, and $(*)$ is true for all $-1 \leq j \leq l$.
Termination: As our discussion in previous part, when the final iteration ends, $(*)$ is true for all $-1 \leq j \leq k$.
The correctness of Query follows immediately from $(*)$.
Analysis:
Hence, the time consumption of Preprocess is $T(n,k) = (k+2)t_1 + n t_2 + (k+1)t_3 + t_4$.
And we have $\min\{ t_1, t_2, t_3\} \cdot (n + k) \leq T(n,k) \leq 2\max\{ t_1, t_2, t_3\} \cdot (n + k)$. Therefore, $T(n,k) = \Theta(n+k)$.
8-5
a.
If $k=1$, then $k$-sorted implies that $$A[i] = \sum_{j=i}^{i+1-1}{A[j]} \leq \sum_{j=i+1}^{i+1}{A[j]} = A[i+1]$$ for all $i = 1, 2, ..., n-1$. $\implies$ $A$ is sorted.
b.
$1$, $6$, $2$, $7$, $3$, $8$, $4$, $9$, $5$, $10$
c.
Suppose $A$ is $k$-sorted. Then
$$\frac{\sum_{j=i}^{i+k-1}{A[j]}}{k} \leq \frac{\sum_{j=i+1}^{i+k}{A[j]}}{k} \iff \sum_{j=i}^{i+k-1}{A[j]} \leq \sum_{j=i+1}^{i+k}{A[j]} \iff 0 \leq A[i+k] - A[i] \iff A[i] \leq A[i+k]$$
for all $i = 1, 2, ..., n-1$.
d.
As our disscussion in c. An array $A$ is $k$-sorted if and only if $A[i] \leq A[i+k]$ for all $i = 1, 2, ..., n-1$.
Algorithm($A$, $n$, $k$):
Partition the $n$ elements into $k$ groups of $\lceil \frac{n}{k} \rceil$ elements each($G_j[1 ... \lceil \frac{n}{k} \rceil]$, $j=1, ... , k-1$),
and the last group made up of $n - (k-1)\lceil \frac{n}{k} \rceil$ elements.($G_j[1 ... (k-1)\lceil \frac{n}{k} \rceil]$, $j = k$)
for $j$ from $1$ to $k$:
Sort $G_j$
for $j$ from $1$ to $k$:
for $l$ from $1$ to $G_j.length$:
$A[k(j-1)+l] := G_j[l]$
Evaluate the time complexity of our algorithm,
(i) partition part takes $O(n)$ time
(ii) the first for-loop takes $O(k \cdot \lceil \frac{n}{k} \rceil \lg {\lceil \frac{n}{k} \rceil}) = O(n \lg{\frac{n}{k}})$
(iii) the second for-loop takes $O(n)$ since there are $n$ elements in total.
Hence, the time complexity is $O(n \lg{\frac{n}{k}})$.
f.
Let $k$ be constant. By our disscussion in c. , we have to sort a subsequence $A[1+0k]$, $A[1+1k]$, $...$ , $A[1+(\lfloor \frac{n}{k} \rfloor -1)k]$, and it takes $\Omega(\lfloor \frac{n}{k} \rfloor \lg{\lfloor \frac{n}{k} \rfloor}) = \Omega(n \lg n)$ since $k$ is constant. This completes the proof.
9.3-1
For groups of 7:
Consider the same algorithm by replacing with "groups of 7", the number of elements $\geq x$ is $4( \lceil \frac{1}{2} \lceil \frac{n}{7} \rceil \rceil -2 ) \geq \frac{2n}{7} -8$. Then we have the recurrence relation of time consumption:
$$T(n) \leq T(\lceil \frac{n}{7} \rceil) + T(\frac{5n}{7} + 8) + O(n)$$
Prove $T(n) = O(n)$ by substitution method:
Guess $T(n) = O(n)$, then by induction hypothesis, we have
$T(n) \leq c \lceil \frac{n}{7} \rceil + c(\frac{5n}{7} + 8) + an$
$\leq c \frac{n}{7} + c + c(\frac{5n}{7} + 8) + an$
$\leq (\frac{6c}{7}+a)n + 9c$
$\implies$ By choosing $c > 14a$ and $n \geq n_0 \geq \frac{9c}{14}$, therefore we have $T(n) \leq cn$ and hence $T(n) = O(n)$ by induction.
For groups of 3:
Suppose $T(n) = O(n)$, then $T(n) \leq T(\lceil \frac{n}{3} \rceil) + T(\frac{2n}{3} + 4) + O(n) \leq c \frac{n}{3} + c + c(\frac{2n}{3} + 4) + an$ is impossible to be $\leq cn$.
9.3-8
Design:
Algorithm($X$, $Y$, $n$):
$p_x := 1$, $p_y := 1$
$q_x := n$, $q_y := n$
while( True ):
$m_x := \lfloor \frac{p_x+q_x}{2} \rfloor$, $m_y := \lfloor \frac{p_y+q_y}{2} \rfloor$
if $X[m_x] > Y[m_y]$:
if $p_x$ == $q_x$:
Return $Y[m_y]$
$q_x := m_x$
$p_y := m_y$
else:
if $p_y$ == $q_y$:
Return $X[m_x]$
$p_x := m_x$
$q_y := m_y$
Correctness:
We are gonna use loop invariant to prove correctness.
Loop Invariant:
At the start of each iteration, the median(we will use "the Median" for latter discussion) of all $2n$ elements in arrays $X$ and $Y$ is contained in $X[p_x...q_x] \cup Y[p_y...q_y]$. $(*)$
Initialization:
When the while-loop starts, $X[p_x...q_x] = X[1...n]$ and $Y[p_y...q_y] = Y[1...n]$. $\implies (*)$ holds.
Maintenance:
Suppose $(*)$ holds for current iteration.
$\because$ $X$ and $Y$ are sorted.
$\therefore$ Any subarray of $X$, $Y$ are sorted also.
Note that the Median must lie between the median of $X[p_x...q_x]$ and the median of $Y[p_y...q_y]$,
ie. $\min\{ X[\lfloor \frac{p_x+q_x}{2} \rfloor], Y[\lfloor \frac{p_y+q_y}{2} \rfloor] \} \leq$ the Median $\leq \max\{ X[\lfloor \frac{p_x+q_x}{2} \rfloor], Y[\lfloor \frac{p_y+q_y}{2} \rfloor] \}$.
Suppose $(*)$ holds for current iteration, that is, the Median is contained in $X[p_x...q_x] \cup Y[p_y...q_y]$.
If $X[m_x] > Y[m_y]$, then the removal of $Y[p_y ... m_y-1] \cup X[m_x+1 ... q_x]$ keeps $(*)$ hold for next iteration.
If $X[m_x] \leq Y[m_y]$, then the removal of $X[p_x ... m_x-1] \cup Y[m_y+1 ... q_y]$ keeps $(*)$ hold for next iteration.
Termination:
Note that $p_x$ == $q_x \iff p_y$ == $q_y$ since $q_x-p_x$ always equals to $q_y-p_y$.
Therefore, if $p_x$ == $q_x$ or $p_y$ == $q_y$, then it only remains $X[p_x]$ and $Y[p_y]$ two elements, then we return the smaller one.
Because it has $2n$ elements and we removes even number elements in each iteration, it must terminate and remain only two elements in the final iteration, the smaller one must be the Median.
Analysis:
In each iteration, $q_x-p_x$ == $q_y-p_y$, it removes $\lfloor\frac{q_x - p_x}{2}\rfloor + \lceil\frac{q_y - p_y}{2}\rceil = \lfloor\frac{q_y - p_y}{2}\rfloor + \lceil\frac{q_x - p_x}{2}\rceil = q_x - p_x = q_y - p_y$ elements of $X[p_x...q_x] \cup Y[p_y...q_y]$.
$\implies$ It remains half of remainder of previous iteration.
$\implies$ Since the time consumption of each iteration is constant, the algorithm's time complexity is $O(\lg n)$.